The Download: meet Cathy Tie, and Anthropic's new AI models

MIT Technology Review

Since the Chinese biophysicist He Jiankui was released from prison in 2022, he has sought to make a scientific comeback and to repair his reputation after a three-year incarceration for illegally creating the world's first gene-edited children. One area of visible success on his comeback trail has been his X.com account. Over the past few years, his account has evolved from sharing mundane images of his daily life to spreading outrageous, antagonistic messages. This has left observers unsure what to take seriously. Last month, in reply to MIT Technology Review's questions about who was responsible for the account's transformation into a font of clever memes, He emailed us back: "It's thanks to Cathy Tie." Tie is no stranger to the public spotlight.


eufy launches the world's first robot vacuum with a portable deep cleaner (plus another powerful model)

PCWorld

Are you ready to completely overhaul your floor-cleaning routine? The eufy E28 and E25 have just landed, and we bet you're going to want one of these robotic cleaners delivered to your home sooner rather than later. Both are under pre-sale now, so be sure to order and save your spot in line for these powerful units. The E25 and E28 are two brand-new models unveiled by the company. Both feature eufy's award-winning HydroJet mopping technology and deliver a jaw-dropping 20,000Pa of suction power for a deep clean.


How Do You Measure AI?

Communications of the ACM

Millions of people use artificial intelligence (AI) tools like ChatGPT daily to do everything from generating code to drawing images to creating business ideas. Those AI tools appear to be getting better. Back in November 2022, when it was launched, ChatGPT was powered by GPT-3.5, at the time the most powerful model offered by OpenAI. Yet GPT-3.5 was quickly eclipsed by GPT-4 just a few months later. GPT-4 crushed GPT-3.5 on a range of benchmarks, including performance on the bar exam (GPT-4 scored in the 90th percentile; GPT-3.5 in the 10th).
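For a concrete sense of what "performance on a benchmark" means, a model's score usually reduces to something as simple as the fraction of test items it answers correctly, which can then be ranked against human test-takers as a percentile. A minimal illustrative sketch, not the actual evaluation code behind these results:

```python
def benchmark_score(predictions, answers):
    """Fraction of exact-match correct answers on a benchmark."""
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)

# Toy example: a model answering multiple-choice questions.
model_answers = ["B", "C", "A", "D"]
answer_key = ["B", "C", "D", "D"]
print(benchmark_score(model_answers, answer_key))  # 0.75
```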


Under Trump, AI Scientists Are Told to Remove 'Ideological Bias' From Powerful Models

WIRED

The National Institute of Standards and Technology (NIST) has issued new instructions to scientists who partner with the US Artificial Intelligence Safety Institute (AISI) that eliminate mention of "AI safety," "responsible AI," and "AI fairness" from the skills it expects of members and introduce a request to prioritize "reducing ideological bias, to enable human flourishing and economic competitiveness." The information comes as part of an updated cooperative research and development agreement for AI Safety Institute consortium members, sent in early March. Previously, that agreement encouraged researchers to contribute technical work that could help identify and fix discriminatory model behavior related to gender, race, age, or wealth inequality. Such biases are hugely important because they can directly affect end users and disproportionately harm minorities and economically disadvantaged groups. The new agreement also removes mention of developing tools "for authenticating content and tracking its provenance" as well as "labeling synthetic content," signaling less interest in tracking misinformation and deepfakes.


FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute

Anagnostidis, Sotiris, Bachmann, Gregor, Kim, Yeongmin, Kohler, Jonas, Georgopoulos, Markos, Sanakoyeu, Artsiom, Du, Yuming, Pumarola, Albert, Thabet, Ali, Schönfeld, Edgar

arXiv.org Artificial Intelligence

Despite their remarkable performance, modern Diffusion Transformers are hindered by substantial resource requirements during inference, stemming from the fixed and large amount of compute needed for each denoising step. In this work, we revisit the conventional static paradigm that allocates a fixed compute budget per denoising iteration and propose a dynamic strategy instead. Our simple and sample-efficient framework enables pre-trained DiT models to be converted into flexible ones, dubbed FlexiDiT, allowing them to process inputs at varying compute budgets. We demonstrate how a single flexible model can generate images without any drop in quality while reducing the required FLOPs by more than 40% compared to its static counterpart, for both class-conditioned and text-conditioned image generation. Our method is general and agnostic to input and conditioning modalities. We show how our approach can be readily extended to video generation, where FlexiDiT models generate samples with up to 75% less compute without compromising performance.
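The abstract does not spell out how compute is reduced per step, so the following is only a hypothetical sketch of the core idea: a denoising loop in which each iteration gets its own compute budget, implemented here as a toy denoiser that runs fewer layers when the budget is lower. The layer-skipping scheme, budget schedule, and update rule are all assumptions for illustration, not FlexiDiT's actual mechanism.

```python
import torch
import torch.nn as nn

class ToyFlexibleDenoiser(nn.Module):
    """Stand-in for a 'flexible' DiT: compute_budget selects how many
    blocks to run, so smaller budgets cost fewer FLOPs."""
    def __init__(self, dim=64, depth=8):
        super().__init__()
        self.blocks = nn.ModuleList([nn.Linear(dim, dim) for _ in range(depth)])

    def forward(self, x, t, compute_budget=1.0):
        # t (the timestep) is unused in this toy model.
        n_blocks = max(1, round(compute_budget * len(self.blocks)))
        for block in self.blocks[:n_blocks]:
            x = torch.relu(block(x))
        return x

model = ToyFlexibleDenoiser()
x = torch.randn(4, 64)  # a batch of noisy latents
timesteps = list(range(10, 0, -1))
# Dynamic schedule: full compute on early (noisier) steps, half later.
budgets = [1.0 if t > 5 else 0.5 for t in timesteps]
for t, b in zip(timesteps, budgets):
    x = x - 0.1 * model(x, t, compute_budget=b)  # placeholder update rule
```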


Revisiting Robust RAG: Do We Still Need Complex Robust Training in the Era of Powerful LLMs?

Ding, Hanxing, Tao, Shuchang, Pang, Liang, Wei, Zihao, Chen, Liwei, Xu, Kun, Shen, Huawei, Cheng, Xueqi

arXiv.org Artificial Intelligence

Retrieval-augmented generation (RAG) systems often suffer from performance degradation when encountering noisy or irrelevant documents, driving researchers to develop sophisticated training strategies to enhance their robustness against such retrieval noise. However, as large language models (LLMs) continue to advance, the necessity of these complex training methods is increasingly questioned. In this paper, we systematically investigate whether complex robust training strategies remain necessary as model capacity grows. Through comprehensive experiments spanning multiple model architectures and parameter scales, we evaluate various document selection methods and adversarial training techniques across diverse datasets. Our extensive experiments consistently demonstrate that as models become more powerful, the performance gains brought by complex robust training methods drop off dramatically. We delve into the rationale and find that more powerful models inherently exhibit superior confidence calibration, better generalization across datasets (even when trained with randomly selected documents), and optimal attention mechanisms learned with simpler strategies. Our findings suggest that RAG systems can benefit from simpler architectures and training strategies as models become more powerful, enabling more scalable applications with minimal complexity.
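To make the contrast concrete, below is a minimal sketch of the kind of plain RAG pipeline the paper suggests suffices for strong models: retrieve the top-k documents by similarity, some of which may be noisy or irrelevant, and let the LLM cope with the noise rather than relying on complex robust training. The embed and generate callables are hypothetical stand-ins, not components from the paper.

```python
import numpy as np

def retrieve(query_vec, doc_vecs, docs, k=3):
    """Rank documents by cosine similarity; irrelevant docs may still land in the top-k."""
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9
    )
    return [docs[i] for i in np.argsort(-sims)[:k]]

def rag_answer(question, docs, doc_vecs, embed, generate, k=3):
    """Plain RAG: concatenate retrieved context into the prompt, with no noise filtering."""
    context = "\n\n".join(retrieve(embed(question), doc_vecs, docs, k))
    prompt = f"Answer using the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
    return generate(prompt)
```

Per the paper's findings, a more capable model in the generate slot is what lets this simple setup tolerate retrieval noise.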


Effective Mitigations for Systemic Risks from General-Purpose AI

Uuk, Risto, Brouwer, Annemieke, Schreier, Tim, Dreksler, Noemi, Pulignano, Valeria, Bommasani, Rishi

arXiv.org Artificial Intelligence

The systemic risks posed by general-purpose AI models are a growing concern, yet the effectiveness of mitigations remains underexplored. Previous research has proposed frameworks for risk mitigation, but has left gaps in our understanding of the perceived effectiveness of measures for mitigating systemic risks. Our study addresses this gap by evaluating how experts perceive different mitigations that aim to reduce the systemic risks of general-purpose AI models. We surveyed 76 experts whose expertise spans AI safety; critical infrastructure; democratic processes; chemical, biological, radiological, and nuclear (CBRN) risks; and discrimination and bias. Among 27 mitigations identified through a literature review, we find that domain experts perceive a broad range of risk mitigation measures as both effective in reducing various systemic risks and technically feasible. In particular, three mitigation measures stand out: safety incident reports and security information sharing, third-party pre-deployment model audits, and pre-deployment risk assessments. These measures both receive the highest expert agreement ratings (>60%) across all four risk areas and are the most frequently selected in experts' preferred combinations of measures (>40%). The surveyed experts highlighted that external scrutiny, proactive evaluation, and transparency are key principles for effective mitigation of systemic risks. We provide policy recommendations for implementing the most promising measures, incorporating the qualitative contributions from experts. These insights should inform regulatory frameworks and industry practices for mitigating the systemic risks associated with general-purpose AI.


Anthropic CEO Dario Amodei on Being an Underdog, AI Safety, and Economic Inequality

TIME - Tech

Hanging on the wall of Anthropic's offices in San Francisco in early May, a stone's throw from the conference room where CEO Dario Amodei would shortly sit for an interview with TIME, was a framed meme. Its single panel showed a giant robot ransacking a burning city. Underneath, the image's tongue-in-cheek title: Deep learning is hitting a wall. That's a refrain you often hear from AI skeptics, who claim that rapid progress in artificial intelligence will soon taper off. In the meme, one label points to the rampaging robot: "deep learning." Another points to the devastated city: "wall."